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GENEALOGICAL PARTICLE ANALYSIS OF RARE EVENTS 
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Universite de Nice and Universite Paris 7 

In this paper an original interacting particle system approach is 
developed for studying Markov chains in rare event regimes. The pro- 
posed particle system is theoretically studied through a genealogical 
tree interpretation of Feynman-Kac path measures. The algorithmic 
implementation of the particle system is presented. An estimator for 
the probability of occurrence of a rare event is proposed and its vari- 
ance is computed, which allows to compare and to optimize different 
versions of the algorithm. Applications and numerical implementa- 
tions are discussed. First, we apply the particle system technique to 
a toy model (a Gaussian random walk), which permits to illustrate 
the theoretical predictions. Second, we address a physically relevant 
problem consisting in the estimation of the outage probability due to 
polarization-mode dispersion in optical fibers. 

1. Introduction. The simulation of rare events has become an exten- 
sively studied subject in queueing and reliability models [16], in particular 
in telecommunication systems. The rare events of interest are long waiting 
times or buffer overflows in queueing systems, and system failure events in 
reliability models. The issue is usually the estimation of the probability of 
occurrence of the rare event, and we shall focus mainly on that point. But 
our method will be shown to be also efficient for the analysis of the cascade 
of events leading to the rare event, in order to exhibit the typical physical 
path that the system uses to achieve the rare event. 

Standard Monte Carlo (MC) simulations are usually prohibited in these 
situations because very few (or even zero) simulations will achieve the rare 
event. The general approach to speeding up such simulations is to accelerate 
the occurrence of the rare events by using importance sampling (IS) [16, 
24]. More refined sampling importance resampling (SIR) and closely related 
sequential Monte Carlo methods (SMC) can also be found in [4, 10]. In 



Received May 2004; revised April 2005. 

AMS 2000 subject classifications. 65C35, 65C20, 60F10, 68U20, 62P35. 
Key words and phrases. Rare events, Monte Carlo Markov chains, importance sam- 
pling, interacting particle systems, genetic algorithms. 

This is an electronic reprint of the original article published by the 
Institute of Mathematical Statistics in The Annals of Applied Probability, 
2005, Vol. 15, No. 4, 2496-2534. This reprint differs from the original in 
pagination and typographic detail. 



1 



2 



P. DEL MORAL AND J. GARNIER 



all of these well-known methods the system is simulated using a new set 
of input probability distributions, and unbiased estimates are recovered by 
multiplying the simulation output by a likelihood ratio. In SIR and SMC 
these ratio weights are also interpreted as birth rates. The tricky part of 
these Monte Carlo strategies is to properly choose the twisted distribution. 
The user is expected to guess a more or less correct twisted distribution; 
otherwise these algorithms may completely fail. Our aim is to propose a 
more elaborate and adaptative scheme that does not require any operation 
of the user. 

Recently intensive calculations with huge numerical codes have been car- 
ried out to estimate the probabilities of rare events. We shall present a 
typical case where the probability of failure of an optical transmission sys- 
tem is estimated. The outputs of these complicated systems result from the 
interplay of many different random inputs and the users have no idea of the 
twisted distributions that should be used to favor the rare event. This is in 
fact one of the main practical issues to identify the typical conjunction of 
events leading to an accident. Furthermore these systems are so complicated 
that it is very difficult for the user, if not impossible, to modify the codes 
in order to twist the input probability distributions. We have developed a 
method that does not require twisting the input probability distribution. 
The method consists in simulating an interacting particle system (IPS) with 
selection and mutation steps. The mutation steps only use the unbiased 
input probability distributions of the original system. 

The interacting particle methodology presented in this paper is also closely 
related to a class of Monte Carlo acceptance/rejection simulation techniques 
used in physics and biology. These methods were first designed in the 1950s 
to estimate particle energy transmission [15] , self-avoiding random walks and 
macromolecule evolutions [23] . The application model areas of these particle 
methods now have a range going from advanced signal processing, includ- 
ing speech recognition, tracking and filtering, to financial mathematics and 
telecommunication [10]. 

The idea is the following one. Consider an i?-valued Markov chain (Xp)o<p< 
with nonhomogeneous transition kernels Kp. The problem consists in esti- 
mating the probability of occurrence Pa of a rare event of the form {V{Xn) G 
A} where V is some function from E to M. The IPS consists of a set of N 

particles (Xp*'*)i<j<7v evolving from time p = to p = n. The initial gener- 
ation at p = is a set of independent copies of Xq. The updating from the 
generation p to the generation p + 1 is divided into two stages: 

(1) The selection stage consists in choosing randomly and independently 

particles amongst (Xp'^)i<j<7v according to a weighted Boltzmann- 
Gibbs particle measure, with a weight function that depends on V. Thus, 
particles with low scores are killed, while particles with high scores are 
multiplied. Note that the total number of particles is kept constant. 
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(2) The mutation step consists in mutating independently the particles ac- 
cording to the kernel Kp. Note that the true transition kernel is applied, 
in contrast with IS. 

The description is rough in that the IPS actually acts on the path level. 
The mathematical tricky part consists in proposing an estimator of the prob- 
ability Pa and analyzing its variance. The variance analysis will provide 
useful information for a proper choice of the weight function of the selection 
stage. 

The analysis of the IPS is carried out in the asymptotic framework N ^ 
1 where N is the number of particles, while the number n of mutation- 
selection steps is kept constant. Note that the underlying process can be 
a Markov chain (Xp)o<p<n with a very large number of evolutionary steps 
n. As the variance analysis shows, it can then be more efficient to perform 
selection steps on a subgrid of the natural time scale of the process X . In 
other words, it is convenient to introduce the chain (Xp)o<p<n = iXkp)o<p<n 
where k = h/n and n is in the range 10-100. The underlying process can be 
a time-continuous Markov process (-'^t)tg[o,T] well. In such a situation it 
is convenient to consider the chain (Xp)o<p<n = (-'^pT/n)o<p<n- 

Beside the modeling of a new particle methodology, our main contribution 
is to provide a detailed asymptotic study of particle approximation models. 
Following the analysis of local sampling errors introduced in Chapter 9 in the 
research monograph [4] , we first obtain an asymptotic expansion of the bias 
introduced by the interaction mechanism. We also design an original fluc- 
tuation analysis of polynomial functions of particle random fields, to derive 
new central limit theorems for weighted genealogical tree-based occupation 
measures. The magnitude of the asymptotic variances and comparisons with 
traditional Monte Carlo strategies are discussed in the context of Gaussian 
models. 

Briefly, the paper is organized as follows. Section 2 contains all the the- 
oretical results formulated in an abstract framework. We give a summary 
of the method and present a user-friendly implementation in Section 3. We 
consider a toy model (a Gaussian random walk) in Section 4 to illustrate the 
theoretical predictions on an example where all relevant quantities can be 
explicitly computed. Finally, in Section 5, we apply the method to a physical 
situation emerging in telecommunication. 

2. Simulations of rare events by interacting particle systems. 

2.1. Introduction. In this section we design an original IPS approach for 
analyzing Markov chains evolving in a rare event regime. 

In Section 2.2 we use a natural large deviation perspective to exhibit natu- 
ral changes of reference measures under which the underlying process is more 



4 



P. DEL MORAL AND J. GARNIER 



likely to enter in a given rare level set. This technique is more or less well 
known. It often offers a powerful and elegant strategy for analyzing rare de- 
viation probabilities. Loosely speaking, the twisted distributions associated 
to the deviated process represent the evolution of the original process in the 
rare event regime. In MC Markov chain literature, this changes-of-measure 
strategy is also called the importance sampling (IS) technique. 

In Section 2.3 we present a Feynman-Kac formulation of twisted refer- 
ence path distributions. We examine a pair of Gaussian models for which 
these changes of measures have a nice explicit formulation. In this context, 
we initiate a comparison of the fluctuation-error variances of the "pure" MC 
and the IS techniques. In general, the twisted distribution suggested by the 
physical model is rather complex, and its numerical analysis often requires 
extensive calculations. The practitioners often need to resort to another 
"suboptimal" reference strategy, based on a more refined analysis of the 
physical problem at hand. The main object of this section is to complement 
this IS methodology, by presenting a genetic type particle interpretation of 
a general and abstract class of twisted path models. Instead of hand crafting 
or simplified simulation models, this new particle methodology provides a 
powerful and very flexible way to produce samples according to any com- 
plex twisted measures dictated by the physical properties of the model at 
hand. But, from the strict practical point of view, if there exists already a 
good specialized IS method for a specific rare event problem, then our IPS 
methodology may not be the best tool for that application. 

In Section 2.4 we introduce the reader to a new developing genealogical 
tree interpretation of Feynman-Kac path measures. For a more thorough 
study on this theme we refer to the monograph [4] and references therein. We 
connect this IPS methodology with rare event analysis. Intuitively speaking, 
the ancestral lines associated to these genetic evolution models represent the 
physical ways that the process uses to reach the desired rare level set. 

In the final Section 2.5 we analyze the fluctuations of rare event parti- 
cle simulation models. We discuss the performance of these interpretations 
on a class of warm-up Gaussian models. We compare the asymptotic error- 
variances of genealogical particle models and the more traditional nonin- 
teracting IS schemes. For Gaussian models, we show that the exponential 
fluctuation orders between these two particle simulation strategies are equiv- 
alent. 

2.2. A large deviation perspective. Let Xn be a Markov chain taking 
values at each time n in some measurable state space {En,£n) that may 
depend on the time parameter n. Suppose that we want to estimate the 
probability Pn{a) that Xn enters, at a given fixed date n, into the o-level 
set V~^{\a,co)) of a given energy-like function Vn on En, for some a G M: 
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To avoid some unnecessary technical difficulties, we further assume that 
Pn{a) > 0, and the pair {Xn,Vn) satisfies Cramer's condition E(e'*'^"(^")) < 
oo for all A G M. This condition ensures the exponential decay of the prob- 
abilities F{Vn{Xn) > a) I 0, as a t oo. To see this claim, we simply use the 
exponential version of Chebyshev's inequality to check that, for any A > 
we have 

P(K(^n) > «) < e-^('^-^"'^"W) with A„(A) =^-logE(e^^"(^")). 

As an aside, it is also routine to prove that the maximum of (Aa — A„(A)) with 
respect to the parameter A > is attained at the value A„(o) determined by 
the equation a = E(K(-^n)e'*'^"('^")))/E(e'*'^"(^"))). The resulting inequality 

F{Vn{X„) >a)< e-^"^'^) with A;(a) = sup(Aa - A„(A)) 

A>0 

is known as large deviation inequality. When the Laplace transforms A„ 
are explicitly known, this variational analysis often provides sharp tail es- 
timates. We illustrate this observation on an elementary Gaussian model. 
This warm-up example will be used in several places in the further develop- 
ment of this article. In the subsequent analysis, it is briefly used primarily to 
carry out some variance calculations for natural IS strategies. As we already 
mentioned in the Introduction, and in this Gaussian context, we shall derive 
in Section 2.7 sharp estimates of mean error variances associated to a pair 
of IPS approximation models. 

Suppose that Xn is given by the recursive equation 

(2.2) Xp = Xp„i + Wp 

where Xq = and {Wp)p^n* represents a sequence of independent and identi- 
cally distributed (i.i.d.) Gaussian random variables, with (E(VFi),E(W"^^)) = 
(0, 1). If we take V^(x) = x, then we find that A.„(A) = A^n/2, A„(a) = a/n 
and A* (a) = a? /{2n), from which we recover the well-known sharp exponen- 
tial tails W{Xn >a)< e-"V{2n). 

In more general situations, the analytical expression of A* (a) is out of 
reach, and we need to resort to judicious numerical strategies. The first 
rather crude MC method is to consider the estimate 

N 1 ^ 

i=l 

based on independent copies (X^)i<i<Ar of X„. If is not difficult to check 
that the resulting error- variance is given by 

alia) = Nn{P^{a) - Pn{a)f] = P„(a)(l - P„(a)). 

In practice, {a) is a very poor estimate mainly because the whole sample 
set is very unlikely to reach the rare level. 
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A more judicious choice of MC exploration model is dictated by the large 
deviation analysis presented above. To be more precise, let us suppose that 
a > A~^A„(A), with A > 0. To simplify the presentation, we also assume that 
the initial value Xq = xq is fixed, and we set Vb(xo) = 0. Let be the new 

reference measure on the path space Fn'^={EQ x ••• x En) defined by the 
formula 

(2.3) dpW = ^^p^e^^"(^")<iP„, 

where P„ is the distribution of the original and canonical path (Xp)o<p<n- 
By construction, we have that 

< e-^("-^"'^"(^»Pi^)(K(A„) > a), 

where represents the expectation operator with respect to the distribu- 
tion Pi^\ By definition, the measure pI^^ tends to favor random evolutions 
with high potential values V^(X„). As a consequence, the random paths un- 
der Pn'^^ are much more likely to enter into the rare level set. For instance, 
in the Gaussian example described earlier, we have that 



n 

\2 I 



(2.4) dF^^^ /dFn = n e^(^f 

p=i 

In other words, under Pn^^ the chain takes the form Xp = Xp_i + A -|- Wp, 

and we have Fl^\Xn >a)= Fn{Xn > a — An) (=1/2 as soon as a = An). 
These observations suggest to replace P^{a) by the weighted MC model 



7=1 CLiTn 



associated to N independent copies (X^'*)i<j<7v of the chain under fI^\ 
Observe that the corresponding error-variance is given by 

(2.5) = E[l^„(^„)>, e-^^"(^")]E[e^^"(^")] - P^{a) 

< e-^('^-^"'^"W)p„(a) - 

For judicious choices of A, one expects the exponential large deviation term 
to be proportional to the desired tail probabilities Pn{(i)- In this case, we 
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have ai^\a)'^ < KP^{a) for some constant K. Returning to the Gaussian 
situation, and using Mill's inequalities 

<P(AA(0,l)>t)^e*'/2<i 

which are valid for any i > 0, and any reduced Gaussian random variable 
AA(0, 1) (see, e.g., (6) on page 237 in [25]), we find that 

a^n\a? < e-"'/(2n)p^(^) _ p2(^) < Pl{a)[V2^{a/^+ ^/a) - 1] 

for the value A = A„(a) = a/n which optimizes the large deviation inequality 
(2.6). For typical Gaussian type level indexes a = aQ^/n, with large values 
of ao, we find that Xn{o) = clq/ y/n and 

< (a) [72^(00 + 1/ao) - 1]. 

As an aside, although we shall be using most of the time upper bound 
estimates. Mill's inequalities ensure that most of the Gaussian exponential 
deviations are sharp. 

The formulation (2.5) also suggests a dual interpretation of the variance. 
First, we note that 

and therefore 

fjW(a)2 = P^(^-^)(T/„(X„) > a)E[e^^"(^")]E[e-^^"(^")] - P'^{a). 

In contrast to , the measure Pi now tends to favor low energy states 
Xn- As a consequence, we expect Pi iVniXj^) > a) to be much smaller 
than Pn{a). For instance, in the Gaussian case we have 

^^n^\Xn > a)=¥n{Xn > a + Xn) < e-('^+A")V(2n)_ 
Since we have E[e'^'^"] =E[e"-^'^"] = e'*'^"/^, we can write 

(2.6) (^i^Haf < e-«Vne{a-An)V(2n) _ p2(^) 

which confirms that the optimal choice (giving rise to the minimal variance) 
for the parameter A is X = a/n. 

2.3. Twisted Feynman-Kac path measures. The choice of the "twisted" 
measures Pi^^ introduced in (2.3) is only of pure mathematical interest. In- 
deed, the IS estimates described below will still require both the sampling of 
random paths according to P^"^^ and the computation of the normalizing con- 
stants. As we mentioned in the Introduction, the key difficulty in applying 
IS strategies is to choose the so-called "twisted" reference measures. In the 
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further development of Section 2.4, we shall present a natural genealogical 
tree-based simulation technique of twisted Feynman-Kac path distribution 
of the following form: 



(2.7) 



.p=i 



In the above display, Zn > stands for a normalizing constant, and (Gp)i<p<„ 
represents a given sequence of potential functions on the path spaces (-Fp)i<p<n- 
Note that the twisted measures defined in (2.3) correspond to the (nonunique) 
choice of functions 



(2.8) 



Gp{Xo, 



,X(Vp(Xp)-Vp^-,iXp^r)) 



As an aside, we mention that the optimal choice of twisted measure with re- 
spect to the IS criterion is the one associated to the potential functions G„ = 
ly-iQ^ oo))' ~ ^' p <n. In this case, we have Zn = F{Vn{Xn) > a) 

and Qn is the distribution of the random paths ending in the desired rare 
level. This optimal choice is clearly infeasible, but we note that the resulting 
variance is null. 

The rare event probability admits the following elementary Feynman-Kac 
formulation. 



nVn{Xn)>a)=K 



9rf' (^0, ■ • ■ , Xn) W Gp{Xo, ...,Xp 



ZnQnig^r^ 



with the weighted function defined by 

n 

g^nH^O, ■■■,Xn) = ly„(x„)>a H ^'^(^0^ 



for any path sequence such that 11^=1 Gp{xQ, . . . , Xp) > 0. Otherwise, git^ is 
assumed to be null. 

The discussion given above already shows the improvements one might 
expect in changing the reference exploration measure. The central idea be- 
hind this IS methodology is to choose a twisted probability that mimics 
the physical behavior of the process in the rare event regime. The poten- 
tial functions Gp represent the changes of probability mass, and in some 
sense the physical variations in the evolution of the process to the rare level 
set. For instance, for time-homogeneous models Vp = V, <p <n, the po- 
tential functions defined in (2.8) will tend to favor local transitions that 
increase a given T^-energy function. The large deviation analysis developed 
in Section 2.2 combined with the Feynman-Kac formulation (2.3) gives some 
indications on the way to ch.oos6 the twisted potentia-1 functions ((j'p)i<p<n* 
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Intuitively, the attractive forces induced by a particular choice of poten- 
tials are compensated by increasing normalizing constants. More formally, 
the error-variance of the Q„-importance sampling scheme is given by the 
formula 

(2.9) a^{af = Q-(K(X„) > a)^„^- - P^iaf, 
where Q~ is the path Feynman-Kac measure given by 

dQn=^\f[ g;\Xo, . . . ,Xp)| dP,. 
lp=i J 

Arguing as before, and since Q~ tends to favor random paths with low 
Gp energy, we expect Q~(y„(X„) > a) to be much smaller than the rare 
event probability P(V^(X„) > a). On the other hand, by Jensen's inequal- 
ity we expect the product of normalizing constants ZnZ^ (> 1) to be very 
large. These expectations fail in the "optimal" situation examined above 
{On = l\/"^([a oo))' ^"^^ Gp ~ -^^^ P ^ ''^)- ^^^^ case, we simply note 
that Q„ = Q- = Law{Xo,...,Xn\Vn{Xn) > a), and Q-(K(Xn) > a) = 1, 
and Zn = Z~ = Pn{a). To avoid some unnecessary technical discussions, we 
always implicitly assume that the rare event probabilities -Pn(fl) are strictly 
positive, so that the normalizing constants Zn = Z~ = Pn{a) > are always 
well defined. 

We end this section with a brief discussion on the competition between 
making a rare event more attractive and controlling the normalizing con- 
stants. We return to the Gaussian example examined in (2.2), and instead 
of (2.4), we consider the twisted measure 

(2.10) dQn = dFi^^ = ^ I J] e^^^ [ dP„. 

Zn 



In this case, it is not difficult to check that for any A G M we have zi^^ = 
e ' 2^p=iP . In addition, under Pn the chain X„ has the form 

(2.11) Xp = Xp^i + X{n-p+l) + Wp, l<p<n. 

When A > 0, the rare level set is now very attractive, but the normalizing 
constants can become very large Zn^^ = Zn ^ (> e'^^"^/^^). Also notice that 
in this situation the first term in the right-hand side of (2.9) is given by 

p(-^)(K(Xn)>a)^(')^("') 
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Although we are using inequahties, we recall that these exponential esti- 
mates are sharp. Now, if we take A = 2a/[n{n + 1)], then we find that 

(2.12) p1-^)(K(X„) > a)4^)4-^) < e-(«Vn)(2/3)(i+i/(n+i))^ 

This shows that even if we adjust correctly the parameter A, this IS estimate 
is less efhcient than the one associated to the twisted distribution (2.4). 
The reader has probably noticed that the change of measure defined in 
(2.10) is more adapted to estimate the probability of the rare level sets 
{y-niXn) > cl}, with the historical chain Yn = {Xq, . . . ,X„) and the energy 
function Vn{Yn)=Ep=iXp. 

2.4. A genealogical tree-based interpretation model. The probabilistic in- 
terpretation of the twisted Feynman-Kac measures (2.7) presented in this 
section can be interpreted as a mean field path-particle approximation of 
the distribution fiow (Qn)n>i- We also mention that the genetic type se- 
lection/mutation evolution of the former algorithm can also be seen as an 
acceptance/rejection particle simulation technique. In this connection, and 
as we already mentioned in the Introduction, we again emphasize that this 
IPS methodology is not useful if we already know a specialized and exact 
simulation technique of the desired twisted measure. 

2.4.1. Rare event Feynman-Kac type distributions. To simplify the pre- 
sentation, it is convenient to formulate these models in terms of the historical 
process 

Yn ='-(Xo, .■.,Xn)eFn ='-(^0 X • • • X ^„). 

We let Mn{yn-i,dyn) be the Markov transitions associated to the chain Yn. 
To simplify the presentation, we assume that the initial value Yq = Xq = xq 
is fixed, and we also denote by Kn{xn-i,dxn) the Markov transitions of X„. 
We finally let Bh{E) be the space of all bounded measurable functions on 
some measurable space {E,£), and we equip Bh{E) with the uniform norm. 

We associate to the pair potentials/transitions {Gn,Mn) the Feynman- 
Kac measure defined for any test function /„ G Bi,{Fn) by the formula 



7n(/n)=]E 



fn{Yn) n Gk{Yk) 
l<k<n 



We also introduce the corresponding normalized measure 

?/n(/n) =7n(/n)/7n(l)- 

To simplify the presentation and avoid unnecessary technical discussions, 
we suppose that the potential functions are chosen such that 

sup Gn{yn)/Gn{y'n) < OO. 
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This regularity condition ensures that the normalizing constants 7n(l) and 
the measure 7^ are bounded and positive. This technical assumption clearly 
fails for unbounded or for indicator type potential functions. The Feynman- 
Kac and the particle approximation models developed in this section can 
be extended to more general situations using traditional cut-off techniques, 
by considering Kato-class type of potential functions (see, e.g., [19, 22, 26]), 
or by using different Feynman-Kac representations of the twisted measures 
(see, e.g.. Section 2.5 in [4]). 

In this section we provide a Feynman-Kac formulation of rare event prob- 
abilities. The fluctuation analysis of their genealogical tree interpretations 
will also be described in terms of the distribution flow (7^,7?^), defined as 
{imfln) by replacing the potential functions Gp by their inverse 

Gp = l/Gp. 

The twisted measures Qn presented in (2.7) and the desired rare event 
probabilities have the following Feynman-Kac representation: 

Qn{fn)=r]n{fnGn)hn{Gn) and P(K (X„) > a) = 7„ (T^^^) (1)) . 

In the above displayed formulae, T^\\) is the weighted indicator function 
defined for any path y„ = (xq, . . . , Xn) S -F„ by 

T^-\l){yn)=T^''\l){xo,...,Xn) = lvM>a H (xq, . . . , Xp) . 

l<p<n 

More generally, we have for any ipn £ Bb{Fn) 

n^niXo, . . . ,X„); K(Xn) >a]= ^n{TlC\vn)) 

with the function T^\ipn) given by 

(2.13) T^''\(pn){xQ,...,Xn)=<fn{xo,...,Xn)lVn{x^)>a H (xq, . . . , Xp) . 

l<p<n 

To connect the rare event probabilities with the normalized twisted measures 
we use the fact that 

n 

7n+l(l) = 7n(G'n) = r/„(Gn)7n(l) = H ^pi^p)' 

p=l 

This readily implies that for any /„ G Bh{Fn) 



(2.14) 



ln{fn)=r]n{fn) Vp{Gp). 

l<p<n 
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This yields the formulae 

l<p<n 

(2.15) E{ipn{Xo,...,Xn);Vn{Xn)>a)=rjn{Tt\iPn)) n ^p('^p)' 

l<p<n 

M{MXo,---,Xn)\Vn{Xn)>a)=r,n{Ti-\^n))/Vn{Tt\l)). 

To take the final step, we use the Markov property to check that the 
twisted measures (r/n)n>i satisfy the nonlinear recursive equation 

r]n = ^ni'nn-i)'^= / 7?„_i ((iy„_i)G„_i (y„_i )M„ , •)/?/„_! ) 
starting from rji = Mi(xo, •)• 

2.4.2. Interacting path-particle interpretation. The mean field particle 
model associated with a collection of transformations is a Markov chain 

= {^n)i<i<N taking values at each time n > 1 in the product spaces . 
Loosely speaking, the algorithm will be conducted so that each path-particle 

Cn = iCo,n^Cl,n^ ■ ■ • ) Cn,n) & Fn = {Eq X ■ ■ ■ X £'„) 

is almost sampled according to the twisted measure r/„. 

The initial configuration = {d)i<i<N consists of N independent and 
identically distributed random variables with common distribution 

mi.d{yo,yi)) = Mi(xo,d(yo,yi)) = 6xo{dyo) Ki{yo,dyi). 

In other words, ?i ^='(^0,1' '^1,1) = (^0)Ci,i) ^ Fi = {Eq x Ei) can be inter- 
preted as independent copies xq ^ of the initial elementary transi- 
tion Xq =xq-^ Xi. The elementary transitions ^n-i from FJ^_i into 
F^ are defined by 

TV 

(2.16) P(e„ G d{yl y!^Mn-i) = U a>„(m(en-i))(dy^), 

where m{^n-i) 4r J2iLi , and d{y\, . . . , y^ ) is an infinitesimal neigh- 

borhood of the point (y^,...,y^) G F^ . By the definition of we find 
that (2.16) is the overlapping of simple selection/mutation genetic transi- 
tions 

/- T-tN selection t j-,7V mutation ^ j-^^ 
4n-l G ^n-l * 4n-l G -f'n-l * 4n G . 
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The selection stage consists of choosing randomly and independently N 
path-particles 

Cn-l = (Co,n-li'?l,n-li • • • )^ri-l,n-l) ^ F-n-l 

according to the Boltzmann-Gibbs particle measure 

^n-l(Co,n-H • • • i'^n-l.n-l) ^ 
=1 Z^j'=l "-^n-lV?o,n-l; ■ ■ ■ '?n-l,n-l 



During the mutation stage, each selected path-particle S,n-i is extended by 
an elementary i^„-transition. In other words, we set 

?n (('^0,71' ■ • ■ ' ^n— l,n)' ^n,n) 

= ((Co,n-l) • • • )Cn-l,n-l);Cn,n) ^ Fn = Fn-1 X En, 

where „ is a random variable with distribution Kn{S,n_i •)• The mu- 
tations are performed independently. 

2.4.3. Particle approximation measures. It is of course out of the scope of 
this article to present a full asymptotic analysis of these genealogical particle 
models. We rather refer the interested reader to the recent monograph [4], 
and the references therein. For instance, it is well known that the occupation 
measures of the ancestral lines converge to the desired twisted measures. 
That is, we have with various precision estimates the weak convergence 
result 

1 ^ 

1=1 

In addition, several propagation-of-chaos estimates ensure that the ancestral 
lines ('?o n' ^1 n' • • • 1 '^n n) asymptotically independent and identically dis- 
tributed with common distribution r/„. The asymptotic analysis of regular 
models with unbounded potential functions can be treated using traditional 
cut-off techniques. 

Mimicking (2.14), the unbias particle approximation measures 7^ of the 
unnormalized model 7„ are defined as 

7n^(/n.)= r/„^(/n) JI ^pi^p)- 
l<p<n 

By (2.15), we eventually get the particle approximation of the rare event 
probabilities Pn{a). More precisely, if we let 

(2.18) P„^(a)=7/y(T(")(l))=r?^^(T('^)(l)) \{ r,^{G,), 

l<p<n 
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then we find that P^{a) is an unbiased estimator of -Pn(o) such that 



In addition, by (2.15), the conditional distribution of the process in the rare 
event regime can be estimated using the weighted particle measure 



When no particles have succeeded in reaching the desired level ^([a, cxo)) 

at time n, we have rj^ {Tit\^)) = 0, and therefore rj^ {Tit\^n)) = for any 
ipn G Si){Fn). In this case, we take the convention -P^(a, ipn) = 0. Also notice 
that i]^{Tt\l)) > if and only if we have P^{a) > 0. When P„(a) > 0, 
we have the exponential decay of the probabilities P(P^(a) = 0) — > as A*" 
tends to infinity. 

The above asymptotic, and reassuring, estimates are almost sure conver- 
gence results. Their complete proofs, together with the analysis of extinction 
probabilities, rely on a precise propagation-of-chaos type analysis. They can 
be found in Section 7.4, pages 239-241, and Theorem 7.4.1, page 232 in [4]. 
In Section 2.5 we provide a natural and simple proof of the consistency of 
the particle measures (7^,??^) using an original fluctuation analysis. 

In our context, these almost sure convergence results show that the ge- 
nealogical tree-based approximation schemes of rare event probabilities are 
consistent. Unfortunately, the rather crude estimates say little, as much as 
more naive numerical methods do converge as well. Therefore, we need to 
work harder to analyze the precise asymptotic bias and the fluctuations of 
the occupation measures of the complete and weighted genealogical tree. 
These questions, as well as comparisons of the asymptotic variances, are 
addressed in the next three sections. 

We can already mention that the consistency results discussed above will 
be pivotal in the more refined analysis of particle random fields. They will 
be used in the further development of Section 2.5, in conjunction with a 
semigroup technique, to derive central limit theorems for particle random 
fields. 

2.5. Fluctuations analysis. The fluctuations of genetic type particle mod- 
els have been initiated in [5]. Under appropriate regularity conditions on 
the mutation transitions, this study provides a central limit theorem for the 
path-particle model {^q, . . . , (,n)i<i<N- Several extensions, including Donsker's 
type theorems, Berry-Esseen inequalities and applications to nonlinear fil- 
tering problems can be found in [6, 7, 8, 9]. In this section we design a 



(2.19) 




a.s. 



(2.20) 



P^{a,ipy=nMXo,...,Xn)\Vn{Xn)>a]. 
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simplified analysis essentially based on the fluctuations of random fields as- 
sociated to the local sampling errors. In this subsection we provide a brief 
discussion on the fluctuations analysis of the weighted particle measures in- 
troduced in Section 2.4. We underline several interpretations of the central 
limit variances in terms of twisted Feynman-Kac measures. In the final part 
of this section we illustrate these general and theoretical fluctuations analy- 
ses in the warm-up Gaussian situation discussed in (2.2), (2.4) and (2.10). In 
this context, we derive an explicit description of the error- variances, and we 
compare the performance of the IPS methodology with the noninteracting 
IS technique. 

The fluctuations of the mean field particle models described in Section 2.4 
are essentially based on the asymptotic analysis of the local sampling errors 
associated with the particle approximation sampling steps. These local errors 
are defined in terms of the random fields W^, given for any /„ G Bf,{Fn) by 
the formula 



Wn(/n) = ViV W: -'^>niVn-imn). 

The next central limit theorem is pivotal ([4], Theorem 9.3.1, page 295). For 
any fixed time horizon n > 1, the sequence (Wp)i<p<n converges in law, as 
N tends to infinity, to a sequence of n independent, Gaussian and centered 
random fields (Wp)i<p<n; with, for any fp,gp G Bb{Fp), and I <p<n, 

nWp{fp)Wp{gp)] = rjpilfp - Vp{fp)][gp - Vp{9p)])- 

Let Qp^m 1 < P < be the Feynman-Kac semigroup associated to the dis- 
tribution flow (7p)i<p<n- For p = n it is defined by Qn,n = Id, and for p < n 
it has the following functional representation: 



Qp,n{fn){yp) =IE 



UYn) n Gu{Yi,)\Yp = yp 

p<k<n 



for any test function /„ G Bh{Fn), and any path sequence Up = {xq, . . . , Xp) € 
Fp. The semigroup Qp^n satisfies 

(2.21) Vl<p<n ln = 'lpQp,n- 

To check this assertion, we note that 



7n(/n)=E 



fn{Yn) n Gk{Yk 



l<k<n 



n Gfc(y,) 

l<A:<p 



MUYn) n Gkm\iYo,---,Yp) 

\ p<k<n 
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for any /„ G Bb{En). Using the Markov property we conclude that 

7n(/n)=E 



n GkiYk) 

l<k<p 



n Gk{Yk) 

LLl<fc<p 



rnUYn) n Gk{Yk)\Yp 

\ p<k<n J 



Qp,n{fn){Yp) 



■ ^pQp,n{,fn) 



which estabhshes (2.21). To explain what we have in mind when making 
these definitions, we now consider the elementary telescopic decomposition 



■ In = ^bpQp,n - 7^i(3p-l,n]- 
p=l 



For p = 1, we recall that rjQ = and 71 = 771 = Mi(xo, •)) from which we 
find that rj^ Qo^n = 7iQi,n = 7n- Using the fact that 



lp~iQp-i,P = 7p-i(G'p-i) X $p_i(7?;"„i) and 7p-i(Gp_i) = jp (1) 
the above decomposition implies that 

n 

(2.22) W;i'''ifn)=-VN[j:!-7n]{fn) = Y.y^im^iQp,nfn). 



N 



N , 



p=l 



Lemma 2.1. 7^ is an unbiased estimate of'jn, in the sense that for any 
p>l and fn G Bb{Fn), with \\fn\\ < 1, we have 

IE(7n(/n))=7n(/n) and sup v^E[|7^ (/„) - 7„(/„) H ^Z^' < Cp(n) 

7V>1 

for some constant Cp{n) < 00 whose value does not depend on the function 

fn. 

Proof. We first notice that (7^ (l))„>i is a predictable sequence, in 
the sense that 

/n-l \ n-l 

E{^:^mo,...,^n-i)=nUVpiGpMo,...,Cn-i] = l[VpiGp)=l^il). 

\p=l ) p=l 

On the other hand, by definition of the particle scheme, for any 1 < p < n, 
we also have that 

IE(Wp''(Qp,n/n)|eO,...,ep-l) 

= %/iVE([r?^ - ^piri^.MQpMM^^ ■ ■ • = 0. 
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Combining these two observations, we find that 

{Qp,nfnMo, . ■ .,Cp-l) = 0. 

This yields that 7^ is unbiased. In the same way, using the fact that the 
potential functions are bounded, we have for any p>l and fn G 13i,{Fn), 

with ll/nll <1, 

n 
k=l 

for some constant ai{k) < 00 which only depends on the time parameter. 
We recall that r]^ is the empirical measure associated with a collection of 
N conditionally independent particles with common law ^p{r]p_i). The end 
of the proof is now a consequence of Burkholder's inequality. □ 

Lemma 2.1 shows that the random sequence (7^ (l))i<p<n converges in 
probability, as tends to infinity, to the deterministic sequence (7p(l))i<p<n- 
An application of Slutsky's lemma now implies that the random fields W^'^ 
converge in law, as N tends to infinity, to the Gaussian random fields 
defined for any /„ G Bb{Fn) by 

n 

(2.23) W;i(/„) = E7p(l)>Vp(Qp,n/n). 

p=l 

In much the same way, the sequence of random fields 
(2.24) 

converges in law, as N tends to infinity, to the Gaussian random fields 
defined for any /„ E Bh{Fn) by 



(2.25) 



W^(/n)=W;[(^(/„-r/„(/„)) 

The key decomposition (2.24) also appears to be useful to obtain Lp- 
mean error bounds. More precisely, recalling that 7„(1)/7^(1) is a uni- 
formly bounded sequence w.r.t. the population parameter iV > 1, and using 
Lemma 2.1, we prove the following result. 
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Lemma 2.2. For any p > 1 and fn G Bb{Fn), with \\fn\\ ^ 1; we have 

sup yiVE[|r?^(/„) - VnifnW]'^^ < Cp(n) 
Af>l 

for some constant Cp{n) < oo whose value does not depend on the function 

fn- 

A consequence of the above fluctuations is a central limit theorem for the 
estimators Pn {o) and P^{a^ct)n) introduced in (2.18) and (2.20). 

Theorem 2.3. The estimator {a) given by (2.18) is unbiased, and 
it satisfies the central limit theorem 

(2.26) Vn[P.!^ {a) - Pn{a)] M{Q,a<{af) 
with the asymptotic variance 

(2.27) al{af = Y,bp{l)% {Pp,n{af) - PrA^f] 

p=i 

and the collection of functions Pp^n{o) defined by 

(2.28) XpdEp^ Pp,n{a){xp) = E[lv;^(^„)>jXp = Xp] G [0, 1]. 

In addition, for any (fn G Bb{Fn), the estimator P^{a,(pn) given by (2.20) 
satisfies the central limit theorem 

(2.29) ^[Pn{a, ipn) - Pn{a, ipn)] AA(0, an{a, ipnf) 
with the asymptotic variance 

n 

(2.30) a„(a,^„)2 =P„(a)-2^7p(l)7p-(l)r7;(Pp,„(a,^„)2) 

p=i 

and the collection of functions Pp^n{o-,'-Pn) £ Bb{Fp) defined by 

Pp,n{a, iPn){xo, ...,Xp)= E[((y9„(Xo, . . . , X„) - P„(a, '^n))'i-V„(X„)>a\ 

(2.31) 

{Xo,...,Xp) = {xo,...,Xp)]. 

Proof. We flrst notice that 
(2.32) ^ [P„^(a) - P„(a)] = W;i'^(rW(l)) 

with the weighted function Tn"'\l) introduced in (2.13). If we take fn = 
Tit\l) in (2.22) and (2.23), then we find that W^'^ {Tt\l)) converges 
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in law, as N tends to infinity, to a centered Gaussian random variable 
W2{Ti"'\l)) with the variance 



= Y.^pii)%i[QpATicH^))-vpQpAT!cHm')- 
p=i 

To have a more explicit description of o'J^(a) we notice that 



n Gk{xo,...,XkyAF{Vn{Xn)>a\Xp 

l<k<p ) 



By definition of r^p, we also find that 

r^,{Q,,n{TlC\l)))=¥{Vn[Xn) > a)/7p(l). 
From these observations, we conclude that 



(2.33) 



p=i I 



n G'fc(^0,---,^fc)IE(ly„(x„)>al^p 
Ll<fc<p 



This variance can be rewritten in terms of the distribution flow (r/^ )i<p<n, 
since we have 



7p-(/p)=IE 



n G-(Xo,...,Xfc) 

.i<fc<p 



X fp{Xp) 



Vp ifp) X 7p (1) 



for any fp G Bb{Ep), and any 1 <p<n. Substituting into (2.33) yields (2.27). 

Our next objective is to analyze the fluctuations of the particle conditional 
distributions of the process in the rare event regime defined in (2.20): 



{ipn-Pn{a,ipn)) 



Using the same arguments as above, one proves that the sequence of random 
variabk 
letting 



variables "^^ ' converges in probability, as — > oo, to 1. Therefore, 



fn 



InTn (1) 
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in (2.24) and (2.25), and applying again Slutsky's lemma, we have the weak 
convergence 

V -iV \ ' 

(2.34) 

^r?„rr(i) 

The limit is a centered Gaussian random variable with the variance 



N[Pn {a,ipn) - Pn(a,9J„)]lpAr(„)>o 



(T„(a,99„)^'^='E 



VnTril) J 



Taking into account the definition of and the identities 'qnT^\\) 
Pn{a) hn{'^) and r]nTn\'^n - Pn{a, y^n)) = 0, we obtain 

n 

(2.35) an{a, ifnf = Pn{a)-^ E Jpi^hpiiQpATtKv^n - Pnia, ^n)))?)- 

p=l 

To derive (2.30) from (2.35), we notice that 
(a,v3„,)))(xo,... 



( n G(Xo,...,Xfe))l^.„(^„)>J n G-iXo,...,xM 

\p<k<n / \l<k<n / 

X {ipn{Xo, . . .,Xn) - Pnia,ipn))\{Xo,. . . , Xp) = {xq, ...,Xp) 

= I n G~{xo,.. .,Xk) j 
\l<fe<p / 

X E[{ipn{Xo,...,Xn) -P„ (a, (/?„)) 

X^V„iX„)>a\{Xo,...,Xp) = {xo,...,Xp)] 

= n G~{xo,...,Xk)\ y- Pp^nia,(Pn)ixo,...,Xp) 
\l<k<p ) 

from which we find that 

lpmp,n(Tl^\^n-Pn[a,^n)))f) 

= E ( n G-(Xo,...,Xfc)) xPp,„(a,(^„)(Xo,...,Xp) 

= 7p (-Pp,n(a,¥'n)^) =7p (1) X np{Pp,n{a,^nf). 



□ 
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Arguing as in the end of Section 2.2, we note that the measures rj~ tend 
to favor random paths with low {Gk)i<k<p potential values. Recalling that 
these potentials are chosen so as to represent the process evolution in the 
rare level set, we expect the quantities r/~(Pp^„(a)^) to be much smaller than 
P„(a). In the reverse angle, by Jensen's inequality we expect the normalizing 
constants products 7^(1)7" (1) to be rather large. We shall make precise 
these intuitive comments in the next section, with explicit calculations for 
the pair Gaussian models introduced in (2.4) and (2.10). We end the section 
by noting that 

n 

an{a,ipnf < P„(a)-2^7p(l)7-(l)r?-(Pp,„(a)2) 
p=i 

for any test function ipn, with sup(y^ y/^)g^^2 \ipniyn) - ^niv'Jl < 1- 

2.6. On the weak negligible bias of genealogical models. In this subsection 
we complete the fluctuation analysis developed in Section 2.5 with the study 
of the bias of the genealogical tree occupation measures rj^ , and the corre- 
sponding weighted measures Pj^{a,ipn) defined by (2.20). The forthcoming 
analysis also provides sharp estimates, and a precise asymptotic description 
of the law of a given particle ancestral line. In this sense, this study also com- 
pletes the propagation-of-chaos analysis developed in [4] . The next technical 
lemma is pivotal in our way to analyze the bias of the path-particle models. 

Lemma 2.4. For any n,d> 1, any collection of functions {fn)i<i<d £ 
Bh{FnY o.'^d any sequence {v^)i<i<d S {liV}'^; random products 

nf=i W^''^(/* ) converge in law, as N tends to infinity, to the Gaussian 
products Ylf^iWnifn)- addition, we have 

i=l 

Proof. We first recall from Lemmas 2.1 and 2.2 that, for G {Tj??} and 
for any /„ G Bh{Fn) and p > 1, we have the Lp-mean error estimates 

supE[|W;;'^(/„)ni/^'<oo 

N>1 

with the random fields (WJ'^^W^'^) defined in (2.22) and (2.24). By the 
Borel-Cantelli lemma this property ensures that 

hnifn),Vn (/n)) {ln{fn),Vn{fn)) a.S. 



lim E 



N- 
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By the definitions of tlie random fields ,W2) and given 

in (2.22), (2.23), and in (2.24), (2.25), we have that 

n n 
>V„^'^(/n) = E<;n<(/p%) W„^(/„) = E<„W,(/;,J 



with 



p=i 



C^;: = 7.^(1) 7n(l)/7n^(l)"^°^c'' 



p=l 



and the pair of functions (/jj„, f^^n) defined by 

fp,n~Qp,'n{fn)i fp,n~Qp,i 



7n(l) 



7p(l), 

{fn-Vn{fn)) )• 



With this notation, we find that 



n<'''(/n)= E 



pi,...,Pd=l li=l 



n 



li=l 



n<(/n)= E 



i=l 



pi,...,Pd=l Li=l 



'Pi,n 



li=l 



Recalhng that the sequence of random fields {Wp)i<p<n converges in law, as 
N tends to infinity, to a sequence of n independent, Gaussian and centered 
random fields (Wp)i<p<n, one concludes that HiLi Vy^'''^(/^) converges in 
law, as N tends to infinity, to YYi=i ifn)- This ends the proof of the first 
assertion. 

Using Holder's inequality, we can also prove that any polynomial function 
of terms 

W;:'^(/„), i^G{7,r?},/„G^;,(F„) 

forms a uniformly integrable collection of random variables, indexed by the 
size and precision parameter N >1. This property, combined with the con- 
tinuous mapping theorem, and the Skorohod embedding theorem, completes 
the proof of the lemma. □ 

We first present an elementary consequence of Lemma 2.4. We first rewrite 
(2.24) as follows 

1 



w^''^(/n) = w;i' 



N 



+ 



7n(l) 

7n(l) 
7^(1) 



ifn-Vnifn)) 



.N 



7n(l) 



ifn - Vnifn)) 
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with the pair of functions (/n,,5n.) defined by 



fn 



1 



7n(l) 



ifn-Vn{fn)) and 5„ 



7n(l) 



This yields that 

7VE[7?^(/„)-r?„(/„)] = -E 



7^(1) 



Since the sequence of random variables (7n(l)/7,^ {^))n>i is uniformly bounded, 
and it converges in law to 1, as tends to infinity, by Lemma 2.4 we con- 
clude that 



(2.36) 



hm new: ifn) - Vnifn)] = -E[WZ{fn)W2{9n)] 
JV— >oo 



- J2 <3p,n(l)<5p,n(/n " Vnifn))) 
p=l 



where the renormalized semigroup Qp „ is defined by 

-7^ \ _ Qp,n{fn) _ 7p(l) ^ 

ripQp,nW 7n(l) 

We are now in position to state and prove the main result of this subsection. 



Theorem 2.5. For any n > 1 and ipn G Bi,{Fn), we have 

NE[{P^{a,^n) - Pn{a,^n))lpN(a) 



'{a)>Oj 
n 

-P„(a)"2 7p{l)7p{l)Vp [^'p,n(a)-Pp,n(a, (/?„) 
p=i 



with the collection of functions Pp^„(a), Pp^nici,^n) ^Bb{Fp) defined, respec- 
tively, in (2.28) and (2.31). 

Proof. The proof is essentially based on a judicious way to rewrite 
(2.34). If we define 



(a) 



rpy 

J-n 



(a) 



VnTk^\l) 



{Lpn- Pn{a,ipn)) and g^^'^ 



Tit\l) 

VnTt\l) 
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then, on the event {P^ (a) > 0}, we have 

N[P^{a,ipn)-Pnia,^n)] 

Vnidn') 

By Lemma 2.4 and (2.36) we conclude that 
iVE[(P„^(a, - P„(a, ^n))lpMia)>o] 



-E[W;?(/('^))W^(5l'^))]-E 



W2 



(a) 



7n(l)y "V7n(l) 



1 



On the other hand, using (2.25) we find that 



j2w)/in{i)fn^p{QpAft^))^piQpA9t^ - m 



p=i 

n 



= T.i^pil)/ln{l))\{Qp,n{ft^)QpA9k^^ - 1)). 
p=l 

Similarly, by (2.23) we have 



E 



W2 



Jn 



(a) 



yv2 



7n,(l); "V7n(l) 



1 



p=l 

It is now convenient to notice that 

„(a) 
Jn 



(a) 



^^p{Qp,n^^]Wp{Qp,n 



7n(l) 



7n(l) 



E 



p I ^p,n 



7n(l), 

7„(l)-2E[Wp(gp,„(/('^)))>Vp(Qp,„(l))] 

7,(l)-2 X r^p{Qp,n{ft^)QpA'^))- 



This implies that 



E 



Jn 



(a) 



W2 



1 



:E(7p(l)/7n(l))\(Qp,n(l)Qp,n(/("))) 
p=l 

from which we conclude that 
NK(\P^' 

(2.37) 

-j2Ml)hnil))\iQpAft^)QpA9^n^))- 
p=l 



"V7n(l); "V7n(l), 

we conclude that 

NE{[P^{a,^^) - Pn{a,ipn)]lpN^,)^o) 



N^oo 
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By the definition of tlie function Tn\'fn) we have rinT^\l) =Pn{a)/^n{^) 
and for any = {xq, ...,Xp)£Fp 

Qp,n{Tn\'Pn)){xo, . . . ,Xp) 
l<k<p 

X ¥\ipn{XQ, . . . ,Xn)lVr,{Xr,)>a\{Xo, . . . ,Xp) = [xq, . . . , Xp)]. 

By the definition of the pair of functions {fn'\gn^), these observations yield 



Qp,n{fn^)ixo, ■ ■ ■ ,Xp) — p / N 



n G^{xo,...,Xk) 

.l<k<p 

X Pp^n{a,<fn){xo, - ■ ■ ,Xp), 



Qp,n{9n^){xo, ■ ■ ■ ,Xp) — p / N 



Y[ G^{xo,...,Xk) 

.l<k<p 



Pp^nia){xo, ...,Xp). 



To take the final step, we notice that 

Qp,n{ft^)ixo^---^Xp) >^ QpA9t^)ixo,---,Xp) X (P„(a)/7„(1))^ 



n G*;, (3;o,...,Xfc) 

l<A:<p 



Pp,n{a, Vn)ixo, ■ ■■,Xp)Pp^n{a)ixo, ■ ■ .,Xp). 



This implies that 

7p{Qp,n{ft^)QpA9t'>)) X (Pn(a)/7n(l))' 



.i<fc<p 



X Pp,n{a, fn){Xo, ■ ■ ■ ,Xp)Pp^n{0'){Xo, . . ■ , Xp) 



= lp {Pp,n{'^,^n)Pp,n{")) =lp (1) X % (Pp,„(a, (/3„)Pp_„(a)). 

Using the identity 7p = ^p[\)r]p and substituting the last equation into (2.37) 
completes the proof. □ 

2.7. Variance coTTipavisoTis foT Gaussian particle vfiodcls. Let (^p)i<p<n 
be the Gaussian sequence defined in (2.2). We consider the elementary 
energy-like function Vn{x) = x, and the Feynman-Kac twisted models as- 
sociated to the potential functions 

(2.38) Gp{xQ, . . . , Xp) = exp[A(xp — Xp^i)] for some A > 0. 
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Arguing as in (2.4), we prove that the Feynman-Kac distribution is the 
path distribution of the chain defined by the recursion 

X- =X-_^ + Wp and =Xj;_^-X + Wk, l<k<p, 

where Xq = 0, and where {Wk)i<k<p represents a sequence of independent 
and identically distributed Gaussian random variables, with (E(VFi), E(W^^)) = 
(0, 1). We also observe that in this case we have 

(2.39) jp{lh;{l)=E[e^''^--]^ = e^'^P-'\ 

The next lemma is instrumental for estimating the quantities r]~ {Pp^n{cL)'^) 
introduced in (2.27). 

Lemma 2.6. Let {Wi,W2) be a pair of independent Gaussian random 
variables, with (E(H^j), E(Wf )) = {0,af), with CTj > and i = 1,2. Then, for 
any a > 0, we have the exponential estimate 

C(a, ai, CJ2) < K[¥{Wi + W2> a\Wif] exp( ^ f < 1, 



2a\ + gI 



where 



Proof. Using exponential version of Chebyshev's inequality we first 
check that, for any A > 0, we have 



F{Wi + W2> a\Wi) < e^^^i-"^) E{e^^^) = e^(^i-'^)+^''^i 



/2 



Integrating the random variable Wi and choosing A = a/(2crf + cr2), we es- 
tablish the upper bound 

E[FiWi +W2> a\Wif] < e-2A«+^'(2-?+-i) = ^-ayi2aj+al)^ 

For any e £ (0, 1), we have 

E[P(l^i + W2> a\Wif] > F{W2 > eafF{Wi > (1 - e)a). 
Applying Mill's inequality yields 

E[F{Wi + W2> a\Wif] > ^ 



(ea/da + cJ2/(ea))2((l - e)a/ai + cJi/((l - e)a)) 
Choosing e = a2/{2ai + cjI) establishes the lower bound. □ 
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From previous considerations, we notice that 

r/-(Pp,„(a)2) = E[P(W^i + W2>{a + \{p - l))\Wi)\ 

where (W^i,VF2) are a pair of independent and centered Gaussian random 
variables, with (E(VFf ),E(W2^)) = {p,n-p). Lemma 2.6 now impUes that 

(2.40) 7?-(Pp,„(a)2) <exp[-(a + A(p- l))V(n+p)]. 

Substituting the estimates (2.39) and (2.40) into (2.27), we find that 

n 

al{af < ^[e^^(f-i)-{«+A(P-i))V(n+P) _ p^^af] 
v=\ 

= I] [e-«''/"e(P+i)/("("+P+i))['^-^("P)/(P+i)l'+^V(p+i) _p^(^)2j^ 

0<p<n 

For \ = a/n, this yields that 

0<p<n 

(2.41) 

<^e-(aVn)(l-l/n)_p^(^)2)_ 

We find that this estimate has the same exponential decay as the one ob- 
tained in (2.6) for the corresponding noninteracting IS model. The only 
difference between these two asymptotic variances comes from the multipli- 
cation parameter n. This additional term can be interpreted as the number 
of interactions used in the construction of the genealogical tree simulation 
model. We can compare the efficiencies of the IPS strategy and the usual 
MC strategy which are two methods that do not require to twist the in- 
put probability distribution, in contrast to IS. The IPS provides a variance 
reduction by a factor of the order of nP„(a). In practice the number n of 
selection steps is of the order of ten or a few tens, while the goal is the 
estimation of a probability Pn{o) of the order of 10~^-10~^'^. The gain is 
thus very significant, as we shall see in the numerical applications. 

Now, we consider the Feynman-Kac twisted models associated to the 
potential functions 

(2.42) Gp(xo, . . . , Xp) = exp(A3;p) for some A > 0. 

Arguing as in (2.11), we prove that r]~ is the distribution of the Markov 
chain 

X- = X-_-^ + Wp and =X^_-^-\{p-k) + Wk, l<k<p, 

where Xq = 0, and where {Wk)i<k<p represents a sequence of independent 
and identically distributed Gaussian random variables, with {K{Wi) ,K{Wi)) = 
(0, 1). We also notice that 

(2.43) 7p(l)7-(l)=E[e^^i<^<p^'=]2 = e^'^i<*<p^'. 
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In this situation, we observe that 



P(VFi + W2>a + A ^l^i) 

. V l<A;<p / 



where (VFi,W2) are a pair of independent and centered Gaussian random 
variables, with (E(VF^), E(Ty2 )) = {Pi^~P)- As before. Lemma 2.6 now im- 
phes that 



(2.44) 



% (PpA^-f) ^ exp 



1 



n + p 



a + X 



Pip -I] 



2i 



Using the estimates (2.43) and (2.44), and recalling that J2i<k<n = '^("' + 
l)(2n + l)/6, we conclude that 

n 

Cl{af < ^[e(V6)A2(p-l)p{2p-l)-{a+Ap{p-l)/2)2/(n+p) _ p^(a)2] 
p=l 
n 

= ^[e-(«V")eP/("(n+P))["-An(p-l)/2]2 + (l/12)A2(p-l)p(p+l) _ p^(a)2j_ 
p=l 

If we take A = 2a/[n{n — 1)], then we get 

n 

p=l 

n 



p=l 



with the increasing function 9:e G [0, 1] — > 9{e) = e-^^^j^ + ^ G [0, 1/3]. 
From these observations, we deduce the estimate 



(2.45) 



aZiaf < n[e-(-Vn){2/3)(i-i/(n-i)) _ ^^(^^2]^ 



Note that the inequalities are sharp in the exponential sense by the lower 
bound obtained in Lemma 2.6. Accordingly we get that the asymptotic 
variance is not of the order of Pn{a)'^, but rather P„(a)^/^. As in the first 
Gaussian example, we observe that this estimate has the same exponential 
decays as the one obtained in (2.12) for the corresponding IS algorithm. 
But, once again, the advantage of the IPS method compared to IS is that it 
does not require to twist the original transition probabilities, which makes 
the IPS strategy much easier to implement. 
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Conclusion. The comparison of the variances (2.41) and (2.45) shows 
that the variance of the estimator P^[a) is much smaher when the potential 
(2.38) is used rather than the potential (2.42). We thus get the important 
qualitative conclusion that it is not efficient to select the "best" particles 
(i.e., those with the highest energy values), but it is much more efficient to 
select amongst the particles with the best energy increments. This conclusion 
is also an a posteriori justification of the necessity to carry out a path-space 
analysis, and not only a state-space analysis. The latter one is simpler but 
it cannot consider potentials of the form (2.38) that turn out to be more 
efficient. 

3. Estimation of the tail of a probability density function. We collect 
and sum up the general results presented in Section 2 and we apply them to 
propose an estimator for the tail of the probability density function (p.d.f.) 
of a real- valued function of a Markov chain. We consider an {E,£)-valued 
Markov chain (Xp)o<p<n with nonhomogeneous transition kernels Kp. In a 
first time, we show how the results obtained in the previous section allow us 
to estimate the probability of a rare event of the form {V{Xn) G A}: 



where V is some function from E to M. We shall construct an estimator 
based on an IPS. As pointed out in the previous section, the quality of 
the estimator depends on the choice on the weight function. The weight 
function should fulfill two conditions. First, it should favor the occurrence 
of the rare event without involving too large normalizing constants. Second, 
it should give rise to an algorithm that can be easily implemented. Indeed 
the implementation of the IPS with an arbitrary weight function requires 
recording the complete set of path-particles. If N particles are generated 
and time runs from to n, this set has size (n-|- 1) x x dim(£') which may 
exceed the memory capacity of the computer. The weight function should 
be chosen so that only a smaller set needs to be recorded to compute the 
estimator of the probability of occurrence of the rare event. We shall examine 
two weight functions and the two corresponding algorithms that fulfill both 
conditions. 

Algorithm 1. Let us fix some /? > 0. The first algorithm is built with 
the weight function 



(3.1) 



Pa = nV{Xn) eA) = E[lA{V{Xn))] 



(3.2) 



G^ix)=exp[pV{xp)]. 



The practical implementation of the IPS reads as follows. 
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" (i) 

Initialization. We start with a set of N i.i.d. initial conditions Xq , 
1 < i < N , chosen according to the initial distribution of Xq. This set is com- 
plemented with a set of weig hts Iq*^ = 1, 1 < i < N. This forms a set of N 
particles: {Xq\Yq^^), l<i< N, where a particle is a pair {X,Y) ^ E x M+. 

Now, assume that we have a set of N particles at time p denoted by 

i<i<iv. 

Selection. We first compute the normalizing constant 

1 ^ 

(3.3) ^A^ = _^exp[/3y(X«)]. 

1=1 

We choose independently N particles according to the empirical distribution 

(3.4) <(dX,dy) = ^ X ^exp[/3y(xW)]5(^„_^„^(dX,dy). 

'P i=l 

The new particles are denoted by {Xp\Yp^^), 1 <i < N. 

Mutation. For every I < i < N, the particle {Xp\Yp^^) is transformed 
into {Xpl^i,Yp2i) by the mutation procedure 

(3.5) XW'H^I^J,, 

where the mutations are performed independently, and 

(3.6) y«i = y«exp[-/3F(X«)]. 

The memory required by the algorithm is A^dim(£^) + N + n, where 
A^dim(£^) is the memory required by the record of the set of particles, is 
the memory required by the record of the set of weights and n is the memory 
required by the record of the normalizing constants f/^ , < p < n — 1. The 
estimator of the probability Pa is then 



(3.7) P 



N 
A 



1 ^ 

-Y.^A{v{n^))Yji' 



i=l 



n-1 

X n 

k=0 



This estimator is unbiased in the sense that E[P^] = Pa- The central limit 
theorem for the estimator states that 

(3.8) ViV(P^ -Pa)""-^ Af{0, Qa) 
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(3.9) 



p=i 



k=0 



Lk=0 



■IE[lA(^n)]' 



Algorithm 2. Let us fix some a > 0. The second algorithm is buih 
with the weight function 



(3.10) 



G^{x) = exp[a(y(xp) - V{xp.i))]. 



Initialization. We start with a set of N i.i.d. initial conditions X, 



(0 

' 



1 < i < A^, chosen according to the initial distribution of Xq. This set is 

" (i) 

complemented with a set of parents Wq = xq, 1 < i < N , where xq is 
an arbitrary point of E with V{xq) = Vq. This forms a set of N particles: 

{W^'\X^'^), l<i<N, where a particle is a pair {W,X) £Ex E. 

Now, assume that we have a set of N particles at time p denoted by 
{Wi'\x^'^), l<i<N. 

Selection. We first compute the normalizing constant 

N 



(3.11) 



N 



l5:exp[a(y(lW)-y(T^»))]. 

i=l 



We choose independently N particles according to the empirical distribution 

N 



(3.12) 



1 

{dW,dX) = ^exp[a(y(X») - m«))] 
^^"p i=i 



The new particles are denoted by (Wp ,Xp '), 1 <i < N. 

Mutation. For every 1 < i < N , the particle {Wp'\Xp^) is transformed 



into {Wp_li, Xp^i) by the mutation procedure Xp 

" (i) 

tations are performed independently, and VFp+i 



^here the mu- 



X, 



The memory required by the algorithm is 2A^dim(i?) +n. The estimator 
of the probability P4 is then 



TV 



(3.13) 



1 ^ 

- ^ lAiVixl^)) exp(-a(F(#«) - Vo)) 



N 



1=1 



n-l 
k=0 
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This estimator is unbiased and satisfies the central Hmit theorem (3.8). 

Let us now focus our attention to the estimation of the p.d.f. tail of V{Xn)- 
The rare event is then of the form {V{Xn) £ [a, a + 5a)} with a large a and 
an evanescent 6a. We assume that the p.d.f. of V{Xn) is continuous so that 
the p.d.f. can be seen as 



p{a) = lim^psaia), 
Sa^o da 

We propose to use the estimator 
(3.14) plia)- 



PSa{a)=F{V{Xn)€[a,a + 5a)). 



with a small 6a. The central limit theorem for the p.d.f. estimator takes the 
form 

(3.15) 



Niplia)-psa{a)) ^^^Mi0,piia,6a)), 



where the variance P2{a,6a) has a limit P2i<i) as 6a goes to which admits 
a simple representation formula: 



(3.16) 



hm -^E 

Sa^O oa 



n—1 
k=0 



E 



n-l 

n Gkix) 

k=0 



Note that all other terms in the sum (3.9) are of order 6a'^ and are therefore 
negligible. This is true as soon as the distribution of V{Xn) given Xp for 
p < n admits a bounded density with respect to the Lebesgue measure. 
Accordingly, the variance P2{a) can be estimated by limsa~to{^0')~^Qi^ a+5a)^ 
^ is given by 

1 ^ 

-5:iA(y(x»))(y»^2 



where is given by 



(3.17) 



N 



n-1 
k=0 



for Algorithm 1, and by 

N 

(3.18) 



1 ^ 



N 



Vo)) 



n-l 
k=0 



for Algorithm 2. The estimators of the variances are important because 
confidence intervals can then be obtained. 

The variance analysis carried out in Section 2.7 predicts that the second 
algorithm [with the potential (3.2)] should give better results than the Algo- 
rithm 1 [with the potential (3.10)]. We are going to illustrate this important 
statement in the following sections devoted to numerical simulations. 

From a practical point of view, it can be interesting to carry out several 
IPS simulations with different values for the parameters f3 (Algorithm 1) 



GENEALOGICAL PARTICLE ANALYSIS OF RARE EVENTS 



33 



and a (Algorithm 2), and also one MC simulation. It is then possible to 
reconstruct the p.d.f. of V{Xn) by the following procedure. Each IPS or MC 
simulation gives an estimation for the p.d.f. p (whose theoretical value does 
not depend on the method) and also an estimation for the ratio p2 /p (whose 
theoretical value depends on the method). We first consider the different 
estimates of a ^ p2/p{a) and detect, for each given value of a, which IPS 
gives the minimal value of p2/p{a). For this value of a, we then use the 
estimation of p{a) obtained with this IPS. This method will be used in 
Section 5. 

4. A toy model. In this section we apply the IPS method to compute 
the probabilities of rare events for a very simple system for which we know 
explicit formulas. The system under consideration is the Gaussian random 
walk Xp+i = Xp + Wp+i, Xq = 0, where the (W^p)p=i,...,n are i.i.d. Gaussian 
random variables with zero mean and variance 1. Let n be some positive 
integer. The goal is to compute the p.d.f. of Xn, and in particular the tail 
corresponding to large positive values. 

We choose the weight function 



The theoretical variance of the p.d.f. estimator can be computed from (3.16): 



When a = 0, we have P2{o.) =p{a), which is the result of standard MC. For 
a 7^ 0, the ratio p2{a)/p{a) is minimal when a = a{n — 1) and then P2{a.) = 
p(a)v'^7mexp(a2(n- l)/(2n)). This means that the IPS with some given a 
is especially relevant for estimating the p.d.f. tail around a = a{n — 1). 

Let us assume that n 3> 1. Typically we look for the p.d.f. tail for a = ao^/n 
with ao > 1 because ^/n is the typical value of Xn. The optimal choice is 
a = ao/v^ and then the relative error is p2{a)/p{a) ~ v^27rn. 

In Figure 1 we compare the results from MC simulations, IPS simulations 
and theoretical formulas with the weight function (4.1). We use a set of 
2 X 10^ particles to estimate the p.d.f. tail of Xn with n = 15. The agree- 
ment shows that we can be confident with the results given by the IPS for 
predicting rare events with probabilities 10~^^. 

We now choose the weight function 




(4.2) 




(4.3) 




(4.4) 



G(^(x) =exp{Pxp). 



34 



P. DEL MORAL AND J. GARNIER 



We get the same results, but the exphcit expression for the theoretical vari- 
ance of the p.d.f. estimator is 

(4.5) p2(a)=p2(a)x V2^exp(^/32^-_^ + A j . 

When /? = 0, we have y^ia) = p{a), which is the result of standard MC. 
For /3 7^ 0, the ratio p2{a)/p{a) is minimal when a = f3n{n — l)/2 and then 
P2{o) = p{a) \/ iTin exp(/5^n(?i^ — l)/24). This means that the IPS with some 
given /3 is especially relevant for estimating the p.d.f. tail around a = — 
l)/2. 

Let us assume that n ^ 1. Typically we look for the p.d.f. tail for a = aQ^fn 
with ao > 1. The optimal choice is /J = 2a{)lr?l'^ and then the relative error 
is P2{a)/p{a) ~ (27rn)-'^/^exp(ag/6) = (27rn)~^/^^p(a)~-^/^. The relative error 
is larger than the one we get with the weight function (4.1). In Figure 2 
we compare the results from MC simulations, IPS simulations and the the- 
oretical formulas with the weight function (4.4). This shows that the weight 
function (4.4) is less efficient than (4.1). Thus the numerical simulations 
confirm the variance comparison carried out in Section 2.7. 

5. Polarization mode dispersion in optical fibers. 

5.1. Introduction. The study of pulse propagation in a fiber with ran- 
dom birefringence has become of great interest for telecommunication appli- 
cations. Recent experiments have shown that polarization mode dispersion 
(PMD) is one of the main limitations on fiber transmission links because 
it can involve significant pulse broadening [13]. PMD has its origin in the 




-30 -ao -10 10 ZO 30 -30 -20 -10 10 20 30 



(a) (b) 

Fig. 1. (a) P.d.f. estimations obtained by the usual MC technique (squares) and by the 
IPS with the weight function (4.1) with ot = l (stars). The solid line stands for the theo- 
retical Gaussian distribution, (b) Empirical and theoretical ratios pijp. 
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Fig. 2. (a) P.d.f. estimations obtained by the usual MC technique (squares) and by the 
IPS with the weight function (4.4) with (5 — 0.15 (stars). The solid line stands for the 
theoretical Gaussian distribution, (b) Empirical and theoretical ratios pilp. 



birefringence [27], that is, the fact that the electric field is a vector field and 
the index of refraction of the medium depends on the polarization state (i.e., 
the unit vector pointing in the direction of the electric vector field) . Random 
birefringence results from variations of the fiber parameters such as the core 
radius or geometry. There exist various physical reasons for the fluctuations 
of the fiber parameters. They may be induced by mechanical distortions on 
fibers in practical use, such as pointlike pressures or twists [21]. They may 
also result from variations of ambient temperature or other environmental 
parameters [2]. 

The difficulty is that PMD is a random phenomenon. Designers want to 
ensure that some exceptional but very annoying event occurs only a very 
small fraction of time. This critical event corresponds to a pulse spreading 
beyond a threshold value. For example, a designer might require that such an 
event occurs less than 1 minute per year [3]. PMD in an optical fiber varies 
with time due to vibrations and variations of environmental parameters. 
The usual assumption is that the fiber passes ergodically through all possible 
realizations. Accordingly requiring that an event occurs a fraction of time p is 
equivalent to requiring that the probability of this event is p. The problem is 
then reduced to the estimation of the probability of a rare event. Typically 
the probability is 10"^ or less [3]. It is extremely difficult to use either 
laboratory experiments or MC simulations to obtain a reliable estimate of 
such a low probability because the number of configurations that must be 
explored is very large. Recently IS has been applied to numerical simulations 
of PMD [2]. This method gives good results; however, it requires very good 
physical insight into the problem because it is necessary for the user to know 
how to produce artificially large pulse widths. We would like to revisit this 
work by applying the IPS strategy. The main advantage is that we do not 
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need to specify how to produce artificially large pulse widths, as the IPS 
will automatically select the good "particles." 

5.2. Review of PMD models. The pulse spreading in a randomly bire- 
fringent fiber is characterized by the so-called square differential group delay 
(DGD) r = |f p. The vector f is the so-called PMD vector, which is solution 
of 

(5.1) r^ = ujn{z) xr + n{z), 

where i}{z) is a three-dimensional zero- mean stationary random process 
modeling PMD. 

5.2.1. The white noise model. Simplified analytical models have been 
studied. In the standard model [12, 13, 20, 27] it is assumed that the process 
$7 is a white noise with autocorrelation function E[r2j(z')r2j(2;)] = a'^6ij6{z' — 
z). In such a case the differential equation (5.1) must be interpreted as a 
stochastic differential equation: 

(5.2) dh = crujrs ° dW^ - aujh o dW^ + a dW] , 

(5.3) dr2 = crujfi o dW^ - ator-s o dW^ + a dW^, 

(5.4) drs = aur2 o dW} - atofi o dW^ + a dW^, 

where o stands for the Stratonovich integral and the W^^s are three indepen- 
dent Brownian motions. It is then possible to establish [12] that the DGD 
r is a diffusion process and in particular that r(ti;, z) obeys a Maxwellian 
distribution if r(0) = (0,0,0)^. More precisely the p.d.f. of t{(jJ,z) is 

5.2.2. Realistic models. The white noise model gives an analytical for- 
mula for the p.d.f. of the DGD, which in turns allows us to compute exactly 
the probability that the DGD exceeds a given threshold value. However, it 
has been pointed out that the p.d.f. tail of the DGD does not fit with the 
Maxwellian distribution in realistic configurations [1] . Various numerical and 
experimental PMD generation techniques involve the concatenation of bire- 
fringent elements with piecewise constant vectors [18]. Equation (5.1) can 
be solved over each segment, and continuity conditions on the segments 
junctions give a discrete model for the PMD vector f . The total PMD vec- 
tor after the (n -|- l)st section can then be obtained from the concatenation 
equation [14] 



(5.5) 
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where a is the DGD per section, 0,n = ^{On) with 

n{9) = {cos{e),sm{6),0f . 

Rn is a matrix corresponding to a rotation through an angle (j)n about the 
axis 0,n- Exphcitly Rn = R{6n,<pn) with 

/ cos2(6i) + sin2(6i)cos((/)) sin(6l) 008(6*) (1 - cos((/))) sin(6') sin(0) \ 
i?(6',(^) = sin(6i)cos(6i)(l -cos(0)) sin^ (6) + cos^ (9) cos{(j)) - cos(6i) sin((/)) . 

y — sin(6') sin(0) cos(6') sin(0) cos((/)) J 

From the probabihstic point of view, the angles (j)n are i.i.d. random vari- 
ables uniformly distributed in (0,27r). The angles On are i.i.d. random vari- 
ables such that cos(0„,) are uniformly distributed in ( — 1, 1) [2]. Accordingly, 
{rn)n£N is a Markov chain. Let us assume that the fiber is modeled as the 
concatenation of n segments and that the outage event is of the form |f„| > a 
for some fixed threshold value a. In the case where a is much larger than the 
expected value of the final DGD |f„|, the outage probability is very small, 
and this is the quantity that we want to estimate. 

5.3. Estimations of outage probabilities. 

5.3.1. Importance sampling. In [2] IS is used to accurately calculate out- 
age probabilities due to PMD. As discussed in the Introduction, the key 
difficulty in applying IS is to properly choose the twisted distribution for 
the driving process {9p,4>p)i<p<n- The papers [2, 11, 17] present different 
twisted distributions and the physical explanations why such distributions 
are likely to produce large DGD's. As a result the authors obtain with 10^ 
simulations good approximations of the p.d.f. tail even for probabilities of 
the order of 10"^^^ The main reported physical result is that the probability 
tail is significantly smaller than the Maxwellian tail predicted by the white 
noise model. 

5.3.2. Interacting particle systems. In this subsection we apply our IPS 
method and compare the results with those obtained by MC and IS. To get 
a reliable estimate of the outage probability of the event, it is necessary to 
generate realizations producing large DGD's. The main advantage of the 
IPS approach is that it proposes a "blink" method that does not require 
any physical insight. Such a method could thus be generalized to more com- 
plicated situations. Here the Markov process is the PMD vector (f„)„gN at 
the output of the nth fiber section. The state space is M^, the initial PMD 
vector is fo = (0,0,0)-^, the Markov transitions are described by (5.5) and 
the energy-like function is V^(f) = |f|. We estimate the p.d.f. p{a) of |f„| by 
implementing the IPS with the two weight functions 

(5.6) G^(f)=exp(/3|fp|) 
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parameterized by /3 > 0, and 

(5.7) G^(f)=exp[a(|fp|-|Vil)] 

parameterized by a > 0. We have implemented Algorithms 1 and 2 as de- 
scribed in Section 3. Before presenting and discussing the results, we would 
like to underline that we have chosen to perform a selection step at the out- 
put of each segment, because the number of segments is not very large. If 
the number of segments were very large, it should be better to perform a 
selection step every two or three or no segments. 

In Figure 3(a) we plot the estimation of the DGD p.d.f. obtained by 
the IPS method with the weight function defined by (5.6). The fiber 
consists in the concatenation of n = 15 segments. The DGD per section is 
a = 0.5. We use a set of A'^ = 2 x 10^ interacting particles. This result can 
be compared with the one obtained in [2], which shows excellent agreement. 
The difference is that our procedure is fully adaptative, it does not require 
any guess of the user, and it does not require to twist the input probability 
density. The variance of estimator of the DGD p.d.f. is plotted in 
Figure 3(b). This figure is actually used to determine the best estimator of 
the DGD p.d.f. by the procedure described at the end of Section 3. 

In Figure 4(a) we plot the estimation of the DGD p.d.f. obtained by the 
IPS method with the weight function defined by (5.7). It turns out that 
the estimated variance of the estimator is smaller with the weight function 
Gp than with the weight function G^ [cf. Figures 4(b) and 3(b)]. This ob- 
servation confirms the theoretical predictions obtained with the Gaussian 
random walk. 




Fig. 3. (a) Segments of the DGD p.d.f. obtained by the usual MC technique (squares) 
and by the IPS with the weight function with /3 = 0.33 (triangles) and /9 = 1 (stars). 
The solid line stands for the Maxwellian distribution obtained with the white noise model. 
The Maxwellian distribution fails to describe accurately the p.d.f. tail, (b); Ratios p2/p. 
The quantity p2 is the standard deviation of the estimator of the DGD p.d.f. In the IPS 
cases, the standard deviations are estimated via the formula (3.17). 




Fig. 4. (a) Segments of the DGD p.d.f. obtained by the usual MC technique (squares) 
and by the IPS with the weight function Gp with a — 2.0 (triangles) and a = 6.0 (stars). 
The solid line stands for the Maxwellian distribution obtained with the white noise model. 
(b) Ratios P2/p- The quantity p2 is the standard deviation of the estimator of the DGD 
p.d.f. In the IPS cases, the standard deviations are estimated via (3.18). 




Fig. 5. Conditional expectations -Da,a+ia(P) of the intermediate DGD atp = 4, 8, 12, 
given that the final DGD lies in the interval {a, a + 5a) with n = 15, Sa — 0.18, and ( from 
top to bottom) a = 7, a = 6.1, a — 5.2. The error bars are obtained from the estimations of 
the conditional variances. 



The IPS approach is also powerful to compute conditional probabilities 
or expectations given the occurrence of some rare event. For instance, we 
can be interested in the moments of the intermediate DGDs given that the 
final DGD lies in the rare set (a, a + 6a): 

Dl,^SaiP^n)=n\rjn^n\e[a,a + 6a)]. 

This information gives us the typical behaviors of the PMD vectors along 
the fiber that give rise to a large final DGD. We use the estimator (2.20) 
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based on the IPS with the weight function (5.7). As shown by Figure 5, the 
typical conditional trajectory of the DGD is close to a linear increase with 
a constant rate given by the ratio of the final DGD over the length of the 
fiber. The conditional variances are found to be small, which shows that 
fluctuations are relatively small around this average behavior. 

REFERENCES 

[1] BlONDlNl, G., Kath, W. L. and Menyuk, C. R. (2001). Non-Maxwellian DGD dis- 
tribution of PMD emulators. In Proc. OFC 2001 ThA5 1-3. 

[2] BiONDiNi, G., Kath, W. L. and Menyuk, C. R. (2002). Importance sampling for 
polarization-mode dispersion. IEEE Photon. Technol. Lett. 14 310-312. 

[3] BuLOW, H. (1999). Limitation of optical first-order PMD compensation. In Proc. 
ECOC 1999 WEI 74-76. 

[4] Del Moral, P. (2004). Feynman~Kac Formulae, Genealogical and Interacting Par- 
ticle Systems With Applications. Springer, New York. MR2044973 

[5] Del Moral, P. and Guionnet, A. (1999). A central limit theorem for nonlin- 
ear filtering using interacting particle systems. Ann. Appl. Probab. 9 275-297. 
MR1687359 

[6] Del Moral, P. and Jacob, J. (2001). Interacting particle filtering with discrete 
observations. In Sequential Monte Carlo Methods in Practice (N. J. Gordon, A. 
Doucet and J. F. G. de Freitas, eds.) 43-75. Springer, New York. MR1847786 

[7] Del Moral, P. and Jacod, J. (2002). The Monte Carlo method for fihering with 
discrete-time observations: Central limit theorems. In Numerical Methods and 
Stochastics, Fields Inst. Commun. 34 29-53. Amer. Math. Soc, Providence, RI. 

[8] Del Moral, P. and Ledoux, M. (2000). On the convergence and the applications 
of empirical processes for interacting particle systems and nonlinear filtering. 
J. Theoret. Probab. 13 225-257. MR1744985 

[9] Del Moral, P. and Tindel, S. (2003). A Berry-Esseen theorem for Feynman-Kac 
and interacting particle models. Publications de I'Institut Elie Cartan 44. Univ. 
Henri Poincare, Nancy. 
[10] Doucet, A., de Freitas, N. and Gordon, N. (2001). Sequential Monte Carlo Meth- 
ods in Practice. Springer, New York. MR1847783 
[11] Fogal, S. L., Biondini, G. and Kath, W. L. (2002). Multiple importance sampling 
for first- and second-order polarization-mode dispersion. IEEE Photon. Technol. 
Lett. 14 1273-1275. 

[12] Garnier, J., Fatome, J. and Le Meur, G. (2002). Statistical analysis of pulse 
propagation driven by polarization-mode dispersion. J. Opt. Soc. Amer. B 19 
1968-1977. 

[13] GiSiN, N., Pellaux, J. P. and Von der Weid, J. P. (1991). Polarization mode 
dispersion for short and long single-mode fibers. J. Lightwave Technol. 9 821- 
827. 

[14] Gordon, J. P. and Kogelnik, H. (2000). PMD fundamentals: Polarization-mode 
dispersion in optical fibers. Proc. Nat. Acad. Sci. U.S.A. 97 4541. 

[15] Harris, T. E. and Kahn, H. (1951). Estimation of particle transmission by random 
sampling. Natl. Bur. Stand. Appl. Math. Ser. 12 27-30. 

[16] Heidelberg, P. (1995). Fast simulation of rare events in queueing and reliability 
models. ACM Transactions on Modeling and Computer Simulation 5 43-85. 



GENEALOGICAL PARTICLE ANALYSIS OF RARE EVENTS 



41 



[17] Lima, I. T., Jr., Biondini, G., Marks, B. S., Kath, W. L. and Menyuk, C. 

R. (2002). Analysis of PMD compensators with fixed DGD using importance 

sampling. IEEE Photon. Technol. Lett. 14 627-629. 
[18] Marcuse, D., Menyuk, C. R. and Wai, P. K. A. (1997). Application of the 

Manakov-PMD equation to studies of signal propagation in optical fibers with 

randomly varying birefringence. J. Lightwave Technol. 15 1735-1746. 
[19] Nagasawa, M. (1991). Stochastic Processes in Quantum Physics. Birkhauser, Boston. 

MR1739699 

[20] Poole, C. D. and Wagner, R. E. (1986). Phenomenological approach to polarization 

dispersion in long single-mode fibers. Electron. Lett. 22 1029-1030. 
[21] Rashleigh, S. C. (1983). Origins and control of polarization effects in single-mode 

fibers. J. Lightwave Technol. 1 312-331. 
[22] Reed, M. and Simon, B. (1975). Methods of Modem Mathematical Physics. IL 

Fourier Analysis, Self Adjointness. Academic Press, New York. MR493420 
[23] ROSENBLUTH, M. N. and Rosenbluth, a. W. (1955). Monte Carlo calculations of 

the average extension of macromolecular chains. J. Chem. Phys. 23 356-359. 
[24] Rubinstein, R. Y. (1981). Simulation and the Monte Carlo Method. Wiley, New 

York. MR624270 

[25] Shorack, G. R. (2000). Probability for Statisticians. Springer, New York. MR1762415 
[26] SzNiTMAN, A. S. (1998). Brownian Motion, Obstacles and Random Media. Springer, 

New York. MR1717054 
[27] Wai, P. K. A. and Menyuk, C. R. (1996). Polarization mode dispersion, decorre- 

lation, and diffusion in optical fibers with randomly varying birefringence. J. 

Lightwave Technol. 14 148-157. 

Labor AToiRE de Mathematiques J. A. Dieudonne Labor atoire de Probabilites 
Universite de Nice et Modeles Aleatoires 

PARC Valrose and Laboratoire Jacques-Louis Lions 

06108 Nice Cedex 2 Universite Paris 7 

France 2 Place Jussieu 

E-MAIL: dclmoral@math.unice.fr 75251 Paris Cedex 5 

France 

E-MAIL: garnicr@math.jussieu.fr 



